PIRCS: a Network-Based Document Routing and Retrieval System
Abstract
Our objective is to enhance the effectiveness and efficiency of ad hoc and routing retrieval for large-scale textbases. Effective retrieval means ranking documents relevant to a user's information need high on the output list. Our text processing and retrieval system PIRCS (Probabilistic Indexing and Retrieval -Components- System) handles English text in a domain-independent fashion. It is based on a probabilistic model, extended with the concept of document components as discussed in our site report from last year. Our focus for enhancing effectiveness remains on three areas: 1) improvements in document representation; 2) combination of retrieval algorithms; and 3) network implementation with learning capabilities. Using representations with more restricted contexts, such as phrases or subdocument units, helps to decrease ambiguity in both retrieval and learning. Combining evidence from different retrieval algorithms is known to improve results. Viewing retrieval as a network helps to implement query-focused and document-focused retrieval and feedback, as well as query expansion.
Similar resources
Information Retrieval from Large Textbases
Our objective is to enhance the effectiveness of retrieval and routing operations for large-scale textbases. Retrieval concerns the processing of ad hoc queries against a static document collection, while routing concerns the processing of static, trained queries against a document stream. Both may be viewed as trying to rank relevant answer documents high in the output. Our text processing and ...
TREC-2 Document Retrieval Experiments using PIRCS
We performed the full experiments using our network implementation of the component probabilistic indexing and retrieval model. Documents were enhanced with a list of semi-automatically generated two-word phrases, and queries with automatic Boolean expressions. An item self-learning procedure was used to initiate network edge weights for retrieval. Initial results submitted were above median for a...
TREC-8 Ad-Hoc, Query and Filtering Track Experiments using PIRCS
In TREC-8, we participated in automatic ad-hoc retrieval as well as the query and filtering tracks. The theme of our participation is 'retrieval lists combination', and the technique is applied throughout our experiments to varying degrees. We point out that our PIRCS system may be considered a combination of a probabilistic retrieval model and a language model approach. For ad-hoc, three t...
TREC-3 Ad-Hoc, Routing Retrieval and Thresholding Experiments using PIRCS
The PIRCS retrieval system has been upgraded in TREC-3 to handle the full English collections of 2 GB in an efficient manner. For ad-hoc retrieval, we use recurrent spreading of activation in our network to implement query learning and expansion based on the best-ranked subdocuments of an initial retrieval. We also augment our standard retrieval algorithm with a soft-Boolean component. For rout...
TREC-7 Ad-Hoc, High Precision and Filtering Experiments using PIRCS
In TREC-7, we participated in the main task of automatic ad-hoc retrieval as well as the high precision and filtering tracks. For ad-hoc, three experiments were done with query types of short (title section of a topic), medium (description section) and long (all sections) lengths. We used a sequence of five methods to handle the short and medium length queries. For long queries we employed a re...